AITopics | diverse and accurate image description

Collaborating Authors

diverse and accurate image description

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space

Neural Information Processing SystemsNov-21-2025, 15:08:22 GMT

This paper explores image caption generation using conditional variational auto-encoders (CVAEs). Standard CVAEs with a fixed Gaussian prior yield descriptions with too little variability. Instead, we propose two models that explicitly structure the latent space around K components corresponding to different types of image content, and combine components to create priors for images that contain multiple types of content simultaneously (e.g., several kinds of objects). Our first model uses a Gaussian Mixture model (GMM) prior, while the second one defines a novel Additive Gaussian (AG) prior that linearly combines component means. We show that both models produce captions that are more diverse and more accurate than a strong LSTM baseline or a "vanilla" CVAE with a fixed Gaussian prior, with AG-CVAE showing particular promise.

additive gaussian encoding space, diverse and accurate image description, variational auto-encoder, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)

Add feedback

Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space

Liwei Wang, Alexander Schwing, Svetlana Lazebnik

Neural Information Processing SystemsNov-21-2025, 08:28:27 GMT

This paper explores image caption generation using conditional variational auto-encoders (CV AEs).

ag-cv ae, caption, gmm-cv ae, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback

Reviews: Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space

Neural Information Processing SystemsMay-28-2025, 01:05:04 GMT

This paper investigated the task of image-conditioned caption generation using deep generative models. Compared to existing methods with pure LSTM pipeline, the proposed approach augments the representation with an additional data dependent latent variable. This paper formulated the problem under variational auto-encoder (VAE) framework by maximizing the variational lowerbound as objective during training. A data-dependent additive Gaussian prior was introduced to address the issue of limited representation power when applying VAEs to caption generation. Empirical results demonstrate the proposed method is able to generate diverse and accurate sentences compared to pure LSTM baseline.

additive gaussian encoding space, diverse and accurate image description, variational auto-encoder, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space

Wang, Liwei, Schwing, Alexander, Lazebnik, Svetlana

Neural Information Processing SystemsFeb-14-2020, 17:56:57 GMT

additive gaussian encoding space, diverse and accurate image description, variational auto-encoder, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback